Parallelization method, parallelization tool, and in-vehicle device

ABSTRACT

A computer generates a parallel program for a multi-core processor having multiple cores by parallelizing parallelizable tasks, based on an analysis of a single program that includes a plurality of tasks written for a single-core microcomputer. The computer includes a macro task (MT) group extractor that analyzes a resource commonly accessed by the plurality of tasks, and extracts a plurality of MTs showing access to such commonly-accessed resource. The computer then uses an allocation restriction determiner to allocate the extracted MTs to the same core in the multi-core processor. With this parallelization method, an overhead in an execution time of the parallel program by the multi-core processor is reduced, and an in-vehicle device is enabled to execute each of the MTs in the program optimally.

CROSS REFERENCE TO RELATED APPLICATION

The present application is based on and claims the benefit of priority of Japanese Patent Application No. 2016-122769, filed on Jun. 21, 2016, the disclosure of which is incorporated herein by reference.

TECHNICAL FIELD

The present disclosure generally relates to a parallelization method and a parallelization tool respectively for generating a parallel program for a multi-core microcomputer based on a single program for a single-core microcomputer, and an in-vehicle device in which the generated parallel program is implemented.

BACKGROUND INFORMATION

A parallelization compile method disclosed in a patent document, Japanese Patent Laid-Open No. 2015-1807 (patent document 1) listed below, for example, serves as a parallelization method to generate a parallel program for a multi-core microcomputer based on a single program for a single-core microcomputer.

In such parallelization compile method, an intermediate language is generated from a source code of the single program by performing a lexical analysis and a syntax analysis, and, by using such an intermediate language, a dependency analysis, optimization, and the like of a plurality of macro tasks (i.e., unit processes hereafter) are performed. Further, the parallelization compile method generates the parallel program based on a scheduling of the plurality of unit processes, which takes into account the dependency of each of the unit processes, and an execution time of each of the unit processes.

However, in a general embedded system, multiple tasks are executed in a switching manner by a real time operating system (RTOS). In such case, even though the parallel program may be generated by parallelizing those tasks, a synchronization process is required for parallelizing the multiple tasks, which makes it necessary for the RTOS to allocate a process time for the synchronization process.

That means, when the parallel program is relatively small, the process time reduced by the parallelization of the multiple tasks is surpassed by an overhead time that is required by the synchronization process. Therefore, the benefit of parallelization may not necessarily be enjoyed by all multi-task programs. In other words, tasks with a relatively short process time are not suitably parallelized.

Further, the parallelization of the above-described tasks with the short process time is not only difficult, but also is prone to an interference with other tasks, which makes it more difficult to execute such tasks in parallel, i.e., simultaneously.

For addressing parallelization of the short process time tasks, performing an inter-core exclusion process in addition to a relevant process may be one solution. However, the inter-core exclusion process has a much greater overhead in comparison to an intra-core exclusion process used in the single-core microcomputer, which may greatly deteriorate the processing capacity of the multi-core microcomputer.

SUMMARY

It is an object of the present disclosure to provide a parallelization method and a parallelization tool that are capable of generating a parallel program that suitably reduces an overhead in an execution time of such program by the multi-core microcomputer, and an in-vehicle device that is capable of optimally executing each of the unit processes in the parallel program.

In an aspect of the present disclosure, a parallelization method generates a parallel program for a multi-core microcomputer having multiple cores. The parallel program parallelizes parallelizable unit processes based on (i) an analysis of a single program for a single-core microcomputer that includes multiple tasks, and (ii) a dependency relationship derived from the analysis and indicative of an access to a same resource from the unit processes. The parallelization method includes an extraction procedure that extracts a plurality of the unit processes accessing the same resource based on an analysis of the same resource commonly accessed by the multiple tasks, and an allocation procedure allocating the plurality of the unit processes extracted by the extraction procedure to a same core of the multi-core microcomputer.

The present disclosure is thus enabled to extract unit processes that access the same resource from among the unit processes included in different tasks, due to an analysis of a commonly-accessed resource and an extraction of a plurality of unit processes that access the commonly-accessed resource.

Further, the present disclosure is enabled to reduce or eliminate the inter-core exclusion process due to the allocation of a plurality of the extracted unit processes to the same core of the multi-core microcomputer.

Therefore, the present disclosure is enabled to generate a parallel program that is capable of reducing an overhead in the execution time of such program.

In another aspect of the present disclosure, a parallelization tool includes a computer for generating a parallel program by parallelizing parallelizable unit processes for a multi-core microcomputer having multiple cores based on a dependency relationship indicative of an access to a same resource from plural unit processes of a single program, according to an analysis of a single program including multiple tasks for a single-core microcomputer. The parallelization tool includes an extractor extracting a plurality of the unit processes accessing the same resource based on an analysis of the same resource commonly accessed by the multiple tasks; and an allocator allocating the plurality of the unit processes extracted by the extractor to a same core of the multi-core microcomputer.

Therefore, the parallelization tool of the present disclosure is enabled to generate a parallel program that is capable of reducing an overhead in the execution time of such program, similarly to the above-described parallelization method.

In yet another aspect of the present disclosure, an in-vehicle device includes a multi-core microcomputer having multiple cores, and a parallel program parallelizing a plurality of unit processes in a single program for a single-core microcomputer, for processing multiple tasks by a single core. The parallel program is configured to (a) parallelize the multiple tasks by parallelizing a plurality of parallelizable unit processes based on a data dependency indicative of an access to a same resource from plural unit processes of the single program, according to an analysis of the single program including the multiple tasks for the single-core microcomputer, (b) extract the plural unit processes based on indication of an access to the commonly-accessed resource, by analyzing a commonly-accessed resource commonly accessed from the multiple tasks, and (c) allocate the extracted plural unit processes to the same core of the multi-core microcomputer. Also, the multi-core microcomputer is configured to execute the unit processes assigned to the multiple cores.

The in-vehicle device is provided with the multi-core microcomputer and the parallel program generated in the above-described manner. Further, each of the multiple cores of the multi-core microcomputer executes the parallel program. That is, the in-vehicle device of the present disclosure executes the parallel program that reduces the overhead. Therefore, the in-vehicle device is capable of optimally executing each of the unit processes.

BRIEF DESCRIPTION OF THE DRAWINGS

Objects, features, and advantages of the present disclosure will become more apparent from the following detailed description made with reference to the accompanying drawings, in which:

FIG. 1 is a block diagram of a configuration of a computer in an embodiment of the present disclosure;

FIG. 2 is a block diagram of a configuration of an in-vehicle device in the embodiment of the present disclosure;

FIG. 3 is a block diagram of functions of the computer in the embodiment of the present disclosure;

FIG. 4 is a flowchart of a process of the computer in the embodiment of the present disclosure;

FIG. 5 is an illustration of a single program in the embodiment of the present disclosure;

FIG. 6 is an illustration of a processing order of MTs in each task in the embodiment of the present disclosure;

FIG. 7 is an illustration of data dependency relationships in each task in the embodiment of the present disclosure;

FIG. 8 is an illustration of a scheduling result of a first task in the embodiment of the present disclosure;

FIG. 9 is an illustration of a scheduling result of a second task in the embodiment of the present disclosure;

FIG. 10 is an illustration of a task switching among multiple cores and an allocation result of each MT in the embodiment of the present disclosure;

FIG. 11 is a block diagram of the functions of the computer in a modification of the present disclosure; and

FIG. 12 is a flowchart of the process of the computer in the modification of the present disclosure.

DETAILED DESCRIPTION

A couple of embodiments for carrying out the present disclosure are described with reference to the drawings in the following.

In each of the plural embodiments, the same component/configuration has the same reference numeral as preceding embodiments, and the description of the same component/configuration is not repeated. In each of the plural embodiments, when a part of a configuration is described, the rest of the configuration may be borrowed from the preceding embodiments.

In the present embodiment, a computer 10 is adopted as a device, which generates a parallel program 21 a 1 parallelized for a multi-core processor 21 having a first core 21 c and a second core 21 d from multiple unit processes in a single program 30 (e.g., a series C source) for a single-core microcomputer having only one core.

Further, in the present embodiment, an automatic parallelization compiler 1 for generating the parallel program 21 a 1 is adopted. Further, in the present embodiment, an in-vehicle device 20 provided with the parallel program 21 a 1 that is generated by the computer 10 is adopted. Note that a processor may be designated as a microcomputer. Therefore, the multi-core processor may be restated as the multi-core microcomputer.

The automatic parallelization compiler 1 includes a procedure for generating the parallel program 21 a 1. Therefore, the automatic parallelization compiler 1 is equivalent to a parallelization method in the claims.

Further, the automatic parallelization compiler 1 is a program including the parallelization method. Further, the computer 10 generates the parallel program 21 a 1 by executing the automatic parallelization compiler 1.

Therefore, the computer 10 is equivalent to a parallelization tool in the claims.

Note that the single program 30 includes multiple tasks 31 and 32, and is thus executed as switching execution of the multiple tasks 31 and 32 by an embedded Real-Time Operating System (RTOS).

The computer 10 generates the parallel program 21 a 1 which parallelizes each of the multiple tasks 31 and 32 in the single program 30.

The automatic parallelization compiler 1 is for generating the parallel program 21 a 1 which parallelizes each of the multiple tasks 31 and 32 in the single program 30.

The background for generating the parallel program 21 a 1 includes problems such as an increase of a heat generation amount from a processor, an increase of an electric power consumption by the processor, and a maximum clock frequency of the processor, together with an increase of popularity of the multi-core processor. That is, even in a field of an in-vehicle device, the use of the multi-core processor 21 is now mainstream, which makes it necessary to adapt control programs to the multi-core processor.

Further, the parallel program 21 a 1 must be developed at low cost in a short term, and must have high reliability, high performance, and high process execution speed.

When generating the parallel program 21 a 1, a data dependency relationship of the multiple unit processes of the single program 30 is analyzed, and the multiple unit processes are allocated, or assigned/arranged, to the different cores 21 c and 21 d of the multi-core processor 21. Regarding details of such allocation, please refer to a patent document JP 2015-1807 A.

In the present embodiment, the single program 30 written in C language is adopted, for an example. However, the single program 30 may also be written in a programming language other than C language.

The above-mentioned unit process may be restated as a processing block, a macro task, or the like. In the following, the unit process may also be designated as Macro Task (MT).

According to the present embodiment, as shown in FIG. 5 etc., an 11th MT to a 13th MT, and a 21st MT to a 23rd MT are adopted as an example.

Each of the MTs includes at least one instruction that is executable by the first core 21 c and the second core 21 d.

Here, the single program 30 adopted in the present embodiment is described with reference to FIG. 3.

A series C source 30 of FIG. 3 is equivalent to the single program 30 in the claims. Further, in FIG. 3, a task switching process in the single-core microcomputer and an MT processing order in the single core are illustrated.

The single program 30 includes a first task 31 and a second task 32, and, at timings of each of wake-ups (WKUP in FIG. 5), for example, the first task 31 and the second task 32 are switched and executed.

Note that, in FIGS. 5-10, for the distinction among different tasks, the MT of the first task 31 and the MT of the second task 32 have different hatchings from each other.

As shown in FIG. 6, the 11th MT to the 13th MT are included in the first task 31. The processing order of those MTs in the first task 31 is from the 11th MT, to the 12th MT, and to the 13th MT.

On the other hand, the 21st MT to the 23rd MT are included in the second task 32. The processing order of those MTs in the second task 32 is from the 21st MT, to the 22nd MT, and to the 23rd MT.

Note that, in the present embodiment, the first task 31 is assumed to have a higher priority than the second task 32.

The multiple MTs include inter-dependent MTs, among which two MTs depend on the same data, or an update of the two MTs is associated by way of the same data. That may also be designated as having a data dependency relationship, or more simply as having a data dependency in the drawing.

According to the present embodiment, the data dependency relationship is found/established between the 11th MT and the 12th MT, between the 11th MT and the 13th MT, and between the 13th MT and the 21st MT. In FIG. 7, the inter-dependent MTs having a data dependency relationship with each other are connected by an arrow.

The data dependency relationship is a relationship of accessing the same data from/by the two MTs. The data dependency relationship may also be understood as an access of each of the two MTs to the same resource.

Therefore, the two MTs with the data dependency relationship show access to the common resource. Thus, two or more data dependency relationships are found in the single program 30 including multiple MTs. In the following, the same resource which the multiple MTs access may also be designated as the common resource.

Further, the data dependency relationship is categorized into the first to third cases.

The first case is a relationship in which a first MT writes to the data (Write) and a second MT reads from the same data (Read). The second case is a relationship in which the first MT and the second MT respectively write to the same data. The third case is a relationship in which the first MT reads from the data and the second MT writes to the same data. Execution sequence (i.e., processing order) of the first MT is prior to the second MT in the single program 30. The first MT and the second MT are example MTs used in order to explain/illustrate the data dependency relationship.
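The three cases above can be illustrated by a minimal sketch in Python, which is not part of the disclosed compiler. The class `MT`, the function `classify_dependency`, the resource names, and the assumption that the 13th MT writes and the 21st MT reads the shared data are all hypothetical and serve illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MT:
    """Hypothetical record of the resources one macro task reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def classify_dependency(first: MT, second: MT) -> dict:
    """Map each shared resource to the case(s) it falls under.

    `first` is assumed to precede `second` in the single program.
    """
    cases = {}
    for res in first.writes & second.reads:
        cases.setdefault(res, set()).add("case 1: write then read")
    for res in first.writes & second.writes:
        cases.setdefault(res, set()).add("case 2: write then write")
    for res in first.reads & second.writes:
        cases.setdefault(res, set()).add("case 3: read then write")
    return cases


# Assumed access pattern: the 13th MT writes "shared", the 21st MT reads it.
mt13 = MT("MT13", reads=frozenset({"b"}), writes=frozenset({"shared"}))
mt21 = MT("MT21", reads=frozenset({"shared"}))

deps = classify_dependency(mt13, mt21)
reversed_deps = classify_dependency(mt21, mt13)
```

Two MTs that only read the same data yield an empty result in this sketch, since none of the three cases applies to them.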

Further, the single program 30 includes the intra-core exclusion process, in order to avoid interference between two MTs that are included in different tasks and access the same data.

Therefore, the single-core microcomputer performs the intra-core exclusion process.

According to the present embodiment, as shown in FIG. 5, an example is adopted in which the intra-core exclusion process is included in order to avoid interference between the 13th MT of the first task 31 and the 21st MT of the second task 32. The intra-core exclusion process is a process that performs a task interruption prohibition, and a task interruption permission.

For example, when the first task 31 is being performed, an interruption of the second task 32 is prohibited, and upon completing execution of the first task 31, an interruption of the second task 32 is permitted.

Further, the parallel program in which two inter-dependent MTs with the data dependency relationship are allocated to the different cores may include the inter-core exclusion process, in order to avoid interference between the different cores.

However, the inter-core exclusion process has a greater overhead in comparison to the intra-core exclusion process, and is a major factor of capacity deterioration.

Here, the configuration of the computer 10 is described with reference to FIG. 1 and FIG. 3.

The computer 10 is provided with a display 11, a Hard Disk Drive (HDD) 12, a Central Processing Unit (CPU) 13, a Read-Only Memory (ROM) 14, a Random Access Memory (RAM) 15, an input device 16, a reader 17, and the like. The computer 10 is capable of reading a memory content stored in a memory medium 18. The automatic parallelization compiler 1 is stored in the memory medium 18. Therefore, the computer 10 is capable of executing the automatic parallelization compiler 1 stored in the memory medium 18, and generates the parallel program 21 a 1.

Regarding the configuration of the computer 10 and the memory medium 18, please refer to a personal computer 100 and a memory medium 180 disclosed in a patent document JP 2015-1807 A.

The automatic parallelization compiler 1 includes, in addition to what is disclosed by JP 2015-1807 A, an MT group extractor 2 d, an allocation restriction determiner 2 e, and the like.

Further, the computer 10 is provided with, as function blocks, blocks for the first task 31, and blocks for the second task 32, as shown in FIG. 3. The computer 10 generates a post-parallelization first task 31 a by using the function blocks for the first task 31, and generates a post-parallelization second task 32 a by using the function blocks for the second task 32. Then, the computer 10 soft-combines the post-parallelization first task 31 a and the post-parallelization second task 32 a to generate the parallel program 21 a 1. Note that the parallel program 21 a 1 may also be considered as a parallelized C source.

In the present embodiment, the computer 10, which is capable of obtaining core-dependency information 41, is adopted.

The core-dependency information 41 is equivalent to core allocation information in the claims, and is information which shows which one of many MTs in the single program 30 is specified as being allocated to the core 21 c or to the core 21 d (i.e., information indicative of allocation destination of MT in the single program 30). Note that an allocation destination specified MT among the MTs in the single program 30 may be considered as dependent on the core specified as the allocation destination.

The function blocks for the first task 31 include a first access analyzer 1 a, a first data dependency analyzer 1 b, a first core dependency determiner 1 c, a first scheduler 1 d, and the like.

The first access analyzer 1 a analyzes an access resource of each MT in the first task 31. That is, the first access analyzer 1 a extracts a resource (i.e., data) which each MT in the first task 31 accesses.

The first data dependency analyzer 1 b analyzes a data dependency relationship of each MT in the first task 31. The first data dependency analyzer 1 b analyzes the data dependency relationship, and extracts parallelizable MTs.

The first core dependency determiner 1 c determines (i) an allocation destination specified MT about which an allocation destination is specified among MTs in the first task 31 and (ii) an allocation destination of such MT based on the core-dependency information 41.

The first scheduler 1 d performs core allocation and scheduling. The first scheduler 1 d allocates, while performing scheduling, each MT in the first task 31 to the first core 21 c or to the second core 21 d. Then, the computer 10 generates the post-parallelization first task 31 a, which is optimized.

The function blocks for the second task 32 include a second access analyzer 2 a, a second data dependency analyzer 2 b, a second core dependency determiner 2 c, an MT group extractor 2 d, an allocation restriction determiner 2 e, a second scheduler 2 f, and the like.

The second access analyzer 2 a analyzes an access resource of each MT in the second task 32. That is, the second access analyzer 2 a extracts a resource which each MT in the second task 32 accesses.

The second data dependency analyzer 2 b analyzes a data dependency relationship of each MT in the second task 32. The second data dependency analyzer 2 b analyzes the data dependency relationship, and extracts parallelizable MTs.

The second core dependency determiner 2 c determines (i) an allocation destination specified MT about which an allocation destination is specified among MTs in the second task 32 and (ii) an allocation destination of such MT based on the core-dependency information 41.

The MT group extractor 2 d extracts an MT group, or a group of MTs, which show access to the common resource from the different tasks (i.e., an extractor in the claims).

That is, the MT group extractor 2 d analyzes a resource accessed in common from the different tasks, and extracts the multiple MTs, e.g., two MTs, which show access to the common resource. Therefore, the extracted MTs show access to the common resource and are included in different tasks.
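The extraction described above may be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the implementation of the MT group extractor 2 d: the function name `extract_mt_groups`, the table layout, and the resource names ("a", "b", "shared", and so on) are hypothetical, chosen only so that the data dependencies of the embodiment (11th and 12th MT, 11th and 13th MT, 13th and 21st MT) appear.

```python
from collections import defaultdict


def extract_mt_groups(access_table):
    """Return {resource: [(task, mt), ...]} for resources accessed from
    more than one task.

    access_table: {task_name: {mt_name: set_of_resource_names}}
    """
    users = defaultdict(list)
    for task, mts in access_table.items():
        for mt, resources in mts.items():
            for res in resources:
                users[res].append((task, mt))

    groups = {}
    for res, accessors in users.items():
        tasks = {task for task, _ in accessors}
        if len(tasks) > 1:  # keep only resources common to different tasks
            groups[res] = sorted(accessors)
    return groups


# Assumed access table for the first task 31 and the second task 32.
access_table = {
    "task31": {"MT11": {"a", "b"}, "MT12": {"a"}, "MT13": {"b", "shared"}},
    "task32": {"MT21": {"shared"}, "MT22": {"c"}, "MT23": {"d"}},
}

groups = extract_mt_groups(access_table)
```

Resources "a" and "b" are shared only inside the first task 31, so they do not form a group here; only the resource accessed from both tasks does.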

Note that the computer 10 performs the function of the MT group extractor 2 d by executing the automatic parallelization compiler 1. Therefore, the MT group extractor 2 d may be equivalent to an extraction procedure in the claims.

The allocation restriction determiner 2 e determines an allocation restriction of each MT in the second task 32, based on a scheduling result of the first scheduler 1 d, a determination result of the second core dependency determiner 2 c, and the extracted MT group (i.e., an allocator in the claims).

That is, the allocation restriction determiner 2 e determines restrictions at the time of allocating each MT in the second task 32 to the first core 21 c or to the second core 21 d.

More specifically, the allocation restriction determiner 2 e allocates the multiple MTs extracted by the MT group extractor 2 d to the same core in the multi-core processor 21 in consideration of the scheduling result of the first scheduler 1 d and the determination result of the second core dependency determiner 2 c.

Note that the allocation restriction determiner 2 e gives priority to the allocation destination specified by the core-dependency information 41, for allocating the multiple MTs extracted by the MT group extractor 2 d to the same core in the multi-core processor 21.

That is, the allocation restriction determiner 2 e allocates the multiple MTs extracted by the MT group extractor 2 d to the same core in the multi-core processor 21 within limits which do not change the allocation destination specified by the core-dependency information 41.
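One possible reading of this priority rule is sketched below in Python. The function `determine_restrictions`, its arguments, the core labels, and the handling of conflicting specifications (the group is simply left unrestricted) are assumptions made for illustration; the disclosure does not specify these details.

```python
def determine_restrictions(groups, pinned, scheduled):
    """Return {mt: core} restrictions for MTs that are not yet fixed.

    groups:    lists of MT names that access a common resource
    pinned:    {mt: core} taken from the core-dependency information
    scheduled: {mt: core} result of scheduling an earlier task
    """
    restrictions = {}
    for group in groups:
        pinned_cores = {pinned[mt] for mt in group if mt in pinned}
        if len(pinned_cores) > 1:
            continue  # assumed: conflicting pins cannot share one core
        if pinned_cores:
            # The core-dependency information takes priority.
            target = next(iter(pinned_cores))
        else:
            scheduled_cores = {scheduled[mt] for mt in group if mt in scheduled}
            if len(scheduled_cores) != 1:
                continue  # nothing fixed yet, or no single core to follow
            target = next(iter(scheduled_cores))
        for mt in group:
            if mt not in pinned and mt not in scheduled:
                restrictions[mt] = target
    return restrictions


# Assumed scheduling result of the first task: all MTs on the first core.
scheduled = {"MT11": "core_21c", "MT12": "core_21c", "MT13": "core_21c"}
group = [["MT13", "MT21"]]

follow_first_task = determine_restrictions(group, {}, scheduled)
pin_respected = determine_restrictions(group, {"MT21": "core_21d"}, scheduled)
```

In the second call the 21st MT is assumed to be pinned to the second core, so no restriction is generated for it and the specified allocation destination is left unchanged.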

Note that, since the computer 10 executes the automatic parallelization compiler 1 to realize a function of the allocation restriction determiner 2 e, the allocation restriction determiner 2 e is considered as equivalent to an allocation procedure in the claims.

Further, the second scheduler 2 f performs core allocation and scheduling. The second scheduler 2 f allocates, while performing scheduling, each MT in the second task 32 to the first core 21 c or to the second core 21 d. Then, the computer 10 generates the post-parallelization second task 32 a, which is optimized.

Here, the processing operation of the computer 10 is described with reference to FIG. 4. Note that each of the steps S10-S16 shown below may be considered as equivalent to a procedure of the automatic parallelization compiler 1.

In Step S10, a processing object task is determined. That is, the computer 10 scans each task one by one, and determines the processing object task.

In Step S11, a resource is extracted. That is, the computer 10 extracts a resource which each MT in the processing object task accesses.

Step S11 may be considered as a process which the first access analyzer 1 a and the second access analyzer 2 a respectively perform.

In Step S12, a data dependency relationship is analyzed. That is, the computer 10 analyzes the data dependency relationship among each of the MTs in the processing object task.

Step S12 may be considered as a process which the first data dependency analyzer 1 b and the second data dependency analyzer 2 b respectively perform.

In Step S13, a core dependency of each MT is determined. That is, when there is the core-dependency information 41 implemented for functional safety reasons, for example, which specifies a core that executes a certain MT, the computer 10 determines the core dependency of each MT according to the core-dependency information 41.

Step S13 may be considered as a process which the first core dependency determiner 1 c and the second core dependency determiner 2 c respectively perform.

In Step S14, the allocation restriction of each MT is determined. That is, the computer 10 extracts an MT group which shows access to the common resource from the different tasks, and determines the allocation restriction of each MT within limits which do not violate the core-dependency information 41, when each MT has the core dependency.

That is, the computer 10 allocates the multiple MTs, which show access to the common resource from the different tasks, to the same core, without violating the core-dependency information 41.

According to the present embodiment, the 13th MT of the first task 31 and the 21st MT of the second task 32 are allocated to the first core 21 c.

Step S14 may be considered as a process which the MT group extractor 2 d and the allocation restriction determiner 2 e respectively perform.

In Step S15, core allocation and scheduling are performed. That is, the computer 10 performs core allocation and scheduling about each MT in the processing object task according to the allocation restriction and the data dependency relationship.

Step S15 may be considered as a process which the first scheduler 1 d and the second scheduler 2 f respectively perform.

Note that the computer 10 performs Steps S10-S15 by treating each of the two tasks 31 and 32 as a processing object task. However, Step S14 is performed for only one of the two tasks 31 and 32. Therefore, the computer 10 does not perform Step S14 while processing the other one of the two tasks 31 and 32. That is, for example, while the first task 31 is determined as the processing object task, Step S14 is not performed.

Then, the computer 10 performs core allocation and scheduling in Step S15 for the first task 31 according to (i) an analysis result of the data dependency relationship and (ii) a determination result that is based on the core-dependency information 41.

Then, the computer 10 determines the next task, i.e., the second task 32, as the processing object task. That is, for the second task 32, the computer 10 performs Step S14, and performs core allocation and scheduling in Step S15 according to the allocation restriction and the data dependency relationship.

In Step S16, it is determined whether all the tasks have been processed. That is, when there is any task which has not undergone Steps S10-S15 among all the tasks, the computer 10 determines that such a task is not processed, and the process returns to Step S10.

When there is no task which has not undergone Steps S10-S15 among all the tasks, the computer 10 determines that all tasks have been processed, and ends the process of FIG. 4.

That is, the computer 10 performs Steps S10-S15 for each of all tasks as a processing object in order. Further, in other words, the computer 10 performs core allocation and scheduling in order for each task.
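The per-task loop of Steps S10-S16 may be sketched as follows. This is a toy model in Python, not the process of FIG. 4 itself: the function `parallelize`, the core labels, the resource names, and in particular the scheduling heuristic (an MT follows the core of an earlier dependent MT of the same task, otherwise the least-loaded core) are assumptions, and execution times and ordering constraints are not modeled.

```python
CORES = ("core_21c", "core_21d")  # hypothetical labels for cores 21 c and 21 d


def parallelize(tasks, pinned=None):
    """tasks: {task: {mt: set_of_resources}} in processing order.
    pinned: optional {mt: core} standing in for core-dependency information.
    Returns {mt: core}.
    """
    pinned = pinned or {}
    allocation = {}  # MT -> core, over all tasks processed so far
    accessed = {}    # MT -> resources, over all tasks processed so far
    load = {core: 0 for core in CORES}

    for index, (task, mts) in enumerate(tasks.items()):          # S10
        resources = {mt: set(res) for mt, res in mts.items()}    # S11
        names = list(resources)
        # S12: an earlier MT of the same task sharing a resource is a predecessor.
        predecessors = {
            mt: [p for p in names[:i] if resources[p] & resources[mt]]
            for i, mt in enumerate(names)
        }
        restriction = {mt: pinned[mt] for mt in names if mt in pinned}  # S13
        if index > 0:                                            # S14 (skipped for first task)
            for mt in names:
                if mt in restriction:
                    continue  # the specified allocation destination is kept
                for other, other_res in accessed.items():
                    if resources[mt] & other_res:
                        restriction[mt] = allocation[other]
                        break
        for mt in names:                                         # S15 (toy scheduler)
            if mt in restriction:
                core = restriction[mt]
            elif predecessors[mt]:
                core = allocation[predecessors[mt][0]]
            else:
                core = min(CORES, key=lambda c: load[c])
            allocation[mt] = core
            load[core] += 1
        accessed.update(resources)
    return allocation                                            # S16: all tasks processed


tasks = {
    "task31": {"MT11": {"a", "b"}, "MT12": {"a"}, "MT13": {"b", "shared"}},
    "task32": {"MT21": {"shared"}, "MT22": {"c"}, "MT23": {"d"}},
}
allocation = parallelize(tasks)
```

With these assumed inputs, the 21st MT lands on the same core as the 13th MT, while the placement of the 22nd and 23rd MTs is merely a by-product of the toy load heuristic and is not taken from the disclosure.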

The computer 10 generates the parallel program 21 a 1 shown in FIG. 10 from the single program 30 shown in FIG. 5 by performing the process in such manner. By performing scheduling for each MT of the first task 31 as shown in FIG. 8, the computer 10 allocates all MTs of the first task 31 to the first core 21 c, which yields the post-parallelization first task 31 a.

Further, by performing scheduling for each MT of the second task 32 as shown in FIG. 9, the computer 10 allocates the 21st MT having a data dependency relationship with the 13th MT to the same first core 21 c to which the 13th MT is allocated.

Further, the computer 10 soft-combines the post-parallelization first task 31 a and the post-parallelization second task 32 a, and generates the parallel program 21 a 1 shown in FIG. 10.

Further, the computer 10 allocates the 13th MT of the first task 31 and the 21st MT of the second task 32 with the data dependency relationship to the same core.

Therefore, interference among the different tasks is prevented by the computer 10 by implementing a measure such as an intra-core exclusion process which may prohibit interruption or the like.

The intra-core exclusion process is basically added and included in a program design before parallelization. However, the intra-core exclusion process may also be added based on an analysis result as needed, and the parallel program 21 a 1 is generated accordingly (i.e., an allocator in the claims).

Here, as shown by a two-dot chain line of FIG. 10, the computer 10 adds the intra-core exclusion process to the 13th MT of the first task 31 and to the 21st MT of the second task 32.

Note that the intra-core exclusion process is included in the single program 30 from the beginning. Further, the automatic parallelization compiler 1 is considered as including an allocation procedure.

As described above, the computer 10 analyzes a commonly-accessed resource, and extracts the multiple MTs that show access to such common resource, and is thereby enabled to extract MTs which access the common resource from among MTs included in the different tasks.

Further, since the computer 10 allocates the multiple extracted MTs to the same core in the multi-core processor 21, the computer 10 is enabled to reduce or eliminate the inter-core exclusion process.

Therefore, the computer 10 is capable of generating the parallel program 21 a 1 which can reduce the overhead of the execution time when executed by the multi-core processor 21. In addition to the above, the computer 10 is capable of generating the parallel program 21 a 1 which has more room to accommodate new processes than a program that includes an inter-core exclusion process, as shown in FIG. 10.

Note that, since the computer 10 generates the parallel program by executing the automatic parallelization compiler 1, the automatic parallelization compiler 1 has the same effects as the computer 10.

Further, even if the computer 10 is not provided with the first core dependency determiner 1 c and the second core dependency determiner 2 c, the computer 10 can still achieve the same object. Therefore, the computer 10 does not need to perform Step S13.

In such case, the computer 10 allocates the multiple MTs which show access to the common resource from the different tasks to the same core regardless of the core dependency in Step S14.

Next, the configuration of the in-vehicle device 20 is described.

The in-vehicle device 20 includes the multi-core processor 21, a communicator 22, a sensor 23, and an input/output port 24, as shown in FIG. 2. The multi-core processor 21 is provided with a ROM 21 a, a RAM 21 b, the first core 21 c, and the second core 21 d.

The in-vehicle device 20 is applicable to an engine control device, a hybrid controlling device, and the like, which are disposed in a vehicle, for example. However, the parallel program 21 a 1 is not limited to such. The core may also be designated as a processor element.

The first core 21 c and the second core 21 d execute the parallel program 21 a 1 for performing an engine control, a hybrid control, and the like. That is, the in-vehicle device 20 performs an engine control, a hybrid control, and the like, by using the first core 21 c and the second core 21 d, which respectively execute the MTs assigned either to the first core 21 c or to the second core 21 d.

Thus, the in-vehicle device 20 is provided with the multi-core processor 21 and the parallel program 21 a 1 generated as mentioned above.

Further, in the multi-core processor 21, each of the cores 21 c and 21 d executes the parallel program 21 a 1. That is, the in-vehicle device 20 executes the parallel program 21 a 1 in which an overhead is reduced.

Therefore, the in-vehicle device 20 can perform each MT in an optimal manner.

The overhead may cause capacity deterioration of the multi-core processor 21. In other words, the multi-core processor 21 executes the parallel program 21 a 1, in which the overhead is reduced, thereby reducing the capacity deterioration.

For the details of the RAM 21 b, the communicator 22, the sensor 23, and the input/output port 24, please refer to the RAM 420, the communications part 430, the sensor part 450, and the input/output port 460 which are disclosed in the patent document JP 2015-1807 A.

When the multiple MTs, which respectively show access to the common resource from the different tasks, are extracted, the computer 10 cannot always allocate the multiple MTs to the same core. In such case, the computer 10 allocates the multiple extracted MTs to the different cores. That is, the computer 10 divides the multiple extracted MTs between the first core 21 c and the second core 21 d.

Then, in order to prevent the first core 21 c and the second core 21 d from each executing one of the multiple MTs and accessing the common resource simultaneously, the computer 10 may add the inter-core exclusion process when generating the parallel program 21 a 1 (i.e., an adder in the claims). Note that a semaphore or the like is employable as the inter-core exclusion process, for example.

In other words, it may be described that the computer 10 generates the parallel program 21 a 1 including an inter-core exclusion process. That is, when generating the parallel program 21 a 1, the computer 10 adds a process for controlling/avoiding a competition (i.e., interference) for accessing the common resource among the different cores to which the multiple extracted MTs have been allocated.

Therefore, the computer 10 is capable of generating the parallel program 21 a 1 with which the competition for the common resource between the first core 21 c and the second core 21 d is reduced, even when the multiple MTs respectively showing access to the common resource from the different tasks have been allocated to the two different cores 21 c and 21 d.
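A semaphore-based inter-core exclusion process may be sketched as follows in Python. This is only an illustration under assumptions: two Python threads stand in for the first core 21 c and the second core 21 d, and the names resource_semaphore, shared, and run_mt are hypothetical. threading.Semaphore is the standard library semaphore.

```python
import threading

# Binary semaphore guarding the common resource (hypothetical name).
resource_semaphore = threading.Semaphore(1)
shared = {"counter": 0}


def run_mt(iterations):
    # Each "core" repeatedly executes an MT that updates the common resource.
    for _ in range(iterations):
        with resource_semaphore:  # acquire before access, release after
            shared["counter"] += 1


# Two threads stand in for the two cores.
core_c = threading.Thread(target=run_mt, args=(10000,))
core_d = threading.Thread(target=run_mt, args=(10000,))
core_c.start()
core_d.start()
core_c.join()
core_d.join()
```

Because every access is bracketed by the semaphore, the two stand-in cores never update the common resource at the same time, and the counter ends at 20000. The acquire and release operations are the overhead that the same-core allocation avoids.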

Note that the automatic parallelization compiler 1 can achieve the same effects as the computer 10. Further, the adder serves as an addition procedure of the automatic parallelization compiler 1.

Further, when dividing the multiple MTs, which respectively show access to the common resource from the different tasks, between the first core 21 c and the second core 21 d, the computer 10 may add an interruption process when generating the parallel program 21 a 1 (i.e., an adder in the claims).

That is, in order to prevent the first core 21 c and the second core 21 d from each executing one of the multiple MTs and accessing the common resource simultaneously, the computer 10 may add the interruption process to interrupt an execution of the other MT during an execution of the one of the multiple MTs.

In other words, it may be described that the computer 10 generates the parallel program 21 a 1 including an interruption process. That is, when generating the parallel program 21 a 1, the computer 10 adds a process for controlling/avoiding the competition (i.e., interference) for accessing the common resource among the different cores to which the multiple extracted MTs have been allocated.

Thus, the computer 10 is capable of generating the parallel program 21 a 1 with which the competition for the common resource between the first core 21 c and the second core 21 d is reduced, even when the multiple MTs respectively showing access to the common resource from the different tasks have been allocated to the two different cores 21 c and 21 d.

Note that the automatic parallelization compiler 1 can achieve the same effects as the computer 10. Further, the adder in the claims is equivalent to the addition procedure of the automatic parallelization compiler 1.

Further, the above-described configuration in the present embodiment may be applicable even to the single program 30 including three or more tasks. In such case, the computer 10 has to have an increased number of function blocks according to the number of the tasks. However, regarding the function block that includes the MT group extractor 2 d and the like, the computer 10 needs to have only one such function block.

Although the present disclosure has been described in connection with a preferred embodiment thereof with reference to the accompanying drawings, it is to be noted that various changes and modifications will become apparent to those skilled in the art.

(Modification)

The modification of the present disclosure is described with reference to FIG. 11 and FIG. 12.

The modification differs from the embodiment described above in that a computer 10 a in the modification performs core allocation and scheduling at one time for all the tasks.

In the modification, the single program 30 includes three tasks, a first task 33, a second task 34, and a third task 35. Note, however, that in the modification, the single program 30 may also include two tasks, or four or more tasks.

The computer 10 a generates the parallel program by executing the automatic parallelization compiler of the modification.

As shown in FIG. 11, the computer 10 a includes the first access analyzer 1 a and the first data dependency analyzer 1 b as the function blocks for the first task 33, and includes the second access analyzer 2 a and the second data dependency analyzer 2 b as the function blocks for the second task 34. Further, the computer 10 a includes a third access analyzer 3 a and a third data dependency analyzer 3 b as the function blocks for the third task 35.

The third access analyzer 3 a is the same as the first access analyzer 1 a or the second access analyzer 2 a. The third data dependency analyzer 3 b is the same as the first data dependency analyzer 1 b or the second data dependency analyzer 2 b.

Further, the computer 10 a includes a core dependency determiner 3, an allocation restriction determiner 4, and an optimizer 5 each as a common function block.

The core dependency determiner 3 is the same as the first core dependency determiner 1 c or the second core dependency determiner 2 c.

The allocation restriction determiner 4 is equivalent to the MT group extractor 2 d and the allocation restriction determiner 2 e (i.e., an allocator in the claims).

The allocation restriction determiner 4 determines the core allocation restriction based on the core dependency and the accessed resource of each task. That is, the allocation restriction determiner 4 extracts an MT group that shows access to the common resource from the different tasks.

Then, the allocation restriction determiner 4 determines the allocation restriction of each MT based on the determination result of the core dependency determiner 3 and the extracted MT group.

That is, the allocation restriction determiner 4 allocates the multiple extracted MTs to the same core in the multi-core processor 21 in consideration of the determination result of the core dependency determiner 3.

Further, the allocation restriction determiner 4 allocates the multiple extracted MTs to the same core in the multi-core processor 21 within the limits which do not change the allocation destinations specified by the core dependency determiner 3.
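The same-core allocation within the limits of the core dependency may be sketched as follows in Python. This is a minimal sketch under assumptions: the core dependency is assumed to be given as a mapping from an MT to its specified core, the function name allocate_with_core_dependency and the MT and resource names are hypothetical, and the handling of a group whose members are bound to different cores (reporting the resource so that an exclusion measure between cores can be considered) is one possible choice rather than the described method.

```python
def allocate_with_core_dependency(groups, pinned, default_core=0):
    """groups: {resource: [mt, ...]} extracted MT groups.
    pinned: {mt: core} destinations specified by the core dependency.
    Returns (allocation, resources whose group could not share one core)."""
    allocation = dict(pinned)  # specified destinations are never changed
    split_resources = []
    for resource in sorted(groups):
        members = groups[resource]
        cores = {allocation[m] for m in members if m in allocation}
        if len(cores) > 1:
            # Members are already bound to different cores, so the whole
            # group cannot be placed on one core.
            split_resources.append(resource)
        # Follow an already decided core if there is one (assumption: the
        # lowest-numbered core is chosen when several are in use).
        target = min(cores) if cores else default_core
        for m in members:
            allocation.setdefault(m, target)
    return allocation, split_resources


# Hypothetical input: MT21 must run on core 1, MT14 on core 0, MT23 on core 1.
groups = {"x": ["MT13", "MT21"], "y": ["MT14", "MT23"]}
pinned = {"MT21": 1, "MT14": 0, "MT23": 1}
allocation, split_resources = allocate_with_core_dependency(groups, pinned)
```

In this example MT13 follows MT21 to core 1, so the group for the resource "x" shares one core, whereas the group for the resource "y" stays divided because both of its members have specified destinations.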

Note that, just like the computer 10, the allocation restriction determiner 4 may add the intra-core exclusion process in order to avoid the interference between the multiple extracted MTs, when the multiple extracted MTs have been allocated to the same core in the multi-core processor 21 (i.e., an allocator in the claims).

As mentioned above, the automatic parallelization compiler in the modification may be described as including an allocation procedure in the claims.

The optimizer 5 is a function block that performs a temporary core allocation, scheduling, and an optimization. The optimizer 5 is equivalent to the first scheduler 1 d and to the second scheduler 2 f.

The optimizer 5 performs the temporary core allocation for the MTs of each task that do not have the core allocation restriction, performs scheduling for each task, and optimizes a process balance of each task.

In the following, the processing operation of the computer 10 a is described with reference to FIG. 12. Note that each of the following Steps S20-S29 is equivalent to a procedure of the automatic parallelization compiler 1 in the modification.

Steps S20-S22 are the same as Steps S10-S12.

In Step S23, it is determined, for all the tasks, whether each task has undergone Steps S20-S22.

When there is any task which has not undergone any one of Steps S20-S22 among all the tasks, the computer 10 a determines that the processing of the tasks is not yet complete, and the process returns to Step S20.

When there is no task which has not undergone Steps S20-S22 among all the tasks, the computer 10 a determines that the processing of all the tasks is complete, and the process proceeds to Step S24.

In Step S24, core dependency of each MT is determined.

The computer 10 a may determine the core dependency of each MT according to the core-dependency information 41, which specifies that a certain MT must be executed by a certain core due to functional safety reasons, for example. Step S24 may be considered as a process which is executed by the core dependency determiner 3.

In Step S25, the core allocation of each MT is determined.

For each MT, among the MTs of all the tasks, whose allocation to a certain core is determinable according to an existing core dependency, the computer 10 a determines the core allocation accordingly.

Further, the computer 10 a allocates, to the same core, an MT group which accesses the common resource in each task.

Step S25 may be considered as a process which is executed by the allocation restriction determiner 4.

In Step S26, each of the not-yet allocated MTs is temporarily allocated to a certain core (i.e., a temporary core allocation).

The computer 10 a temporarily allocates an MT which has not been allocated to any core in Step S25 to an arbitrary core.

In Step S27, an execution sequence of each MT is determined (i.e., scheduling).

The computer 10 a determines an execution sequence of each MT in each task.

In Step S28, an evaluation function is calculated.

The computer 10 a calculates an evaluation function in a state in which each MT is temporarily allocated. The evaluation function is an index for a degree of optimization of the core allocation.

The evaluation function may be formulated, for example, as (the sum total of the processing time of each core) divided by (the maximum processing time among the cores). That is, the evaluation function may take the following form.

Evaluation function = Σ(A×B) / max(A×B)

where A: {execution frequency of each task}
      B: {processing time of each task by each core}
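The evaluation function may be computed, for example, as in the following Python sketch. The sketch reflects one reading of the above form, in which the product A×B is accumulated per core, the numerator is the total over all cores, and the denominator is the largest per-core total. The function name evaluation_function, the unit names, and the numerical values are hypothetical.

```python
def evaluation_function(allocation, frequency, processing_time, num_cores=2):
    """allocation: {unit: core index}
    frequency: {unit: execution frequency A}
    processing_time: {unit: processing time B on its allocated core}"""
    loads = [0.0] * num_cores
    for unit, core in allocation.items():
        loads[core] += frequency[unit] * processing_time[unit]  # A x B
    if max(loads) == 0:
        raise ValueError("no load on any core")
    return sum(loads) / max(loads)


# Hypothetical values for three units of work.
frequency = {"MT1": 1, "MT2": 2, "MT3": 1}
processing_time = {"MT1": 4, "MT2": 1, "MT3": 2}

balanced = evaluation_function(
    {"MT1": 0, "MT2": 1, "MT3": 1}, frequency, processing_time)
one_sided = evaluation_function(
    {"MT1": 0, "MT2": 0, "MT3": 0}, frequency, processing_time)
```

Under this reading, the value ranges from 1, when all the processing is concentrated on one core, to the number of cores, when the processing is evenly balanced. In the example, the balanced allocation gives 2.0 and the one-sided allocation gives 1.0.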

In Step S29, it is determined whether the evaluation function takes the maximum value. That is, the computer 10 a determines whether the value obtained by calculating the evaluation function is the maximum value.

When the evaluation function is not determined as maximized, the computer 10 a interprets that the process balance in each task has not been optimized, and the process returns to Step S26.

When the evaluation function is determined as maximized, the computer 10 a interprets that the process balance in each task is optimized, and ends the process of FIG. 12. That is, the optimization of the process balance is achieved when the evaluation function for evaluating the allocation of MTs to the cores is maximized.

In other words, for the optimization of the process balance, one of the allocations of MTs to the cores that maximize the value of the evaluation function may be selected.

However, since the scale of the evaluation may grow rapidly as the number of MTs increases, an efficient search algorithm may preferably be used. Note, however, that an efficient search algorithm is not a necessary requirement.

As described above, by repeating Steps S26-S29, the computer 10 a optimizes a process balance in each task, i.e., optimizes the core allocation of each MT. That is, the computer 10 a optimizes the process balance in each task by replacing the core allocation of each MT that is performed via temporary allocation of the MTs to the cores. These Steps S26-S29 are considered as a process performed by the optimizer 5.
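The repetition of the temporary allocation and the evaluation may be sketched as follows in Python. This sketch is not the efficient search algorithm mentioned above; it simply tries every temporary allocation, which is an assumption made for brevity. The scheduling of Step S27 is omitted, the cost of each MT is assumed to be its A×B value, and the function names core_loads and optimize_allocation and all values are hypothetical.

```python
from itertools import product


def core_loads(allocation, cost, num_cores):
    """Sum the A x B cost of the MTs placed on each core."""
    loads = [0.0] * num_cores
    for mt, core in allocation.items():
        loads[core] += cost[mt]
    return loads


def optimize_allocation(fixed, free_mts, cost, num_cores=2):
    """fixed: {mt: core} decided by the allocation restriction.
    free_mts: MTs that still need a temporary core allocation."""
    best_allocation, best_value = None, float("-inf")
    # Try every temporary allocation of the free MTs.
    for cores in product(range(num_cores), repeat=len(free_mts)):
        candidate = dict(fixed)
        candidate.update(zip(free_mts, cores))
        loads = core_loads(candidate, cost, num_cores)
        value = sum(loads) / max(loads)  # evaluation function
        if value > best_value:  # keep the maximum found so far
            best_allocation, best_value = candidate, value
    return best_allocation, best_value


# Hypothetical input: MT13 and MT21 are restricted to core 0.
fixed = {"MT13": 0, "MT21": 0}
free_mts = ["MT11", "MT12", "MT22"]
cost = {"MT13": 2, "MT21": 2, "MT11": 3, "MT12": 1, "MT22": 6}
best_allocation, best_value = optimize_allocation(fixed, free_mts, cost)
```

In this example, the restricted MTs remain on core 0, MT11 joins them, and MT12 and MT22 are placed on core 1, which gives a load of 7 on each core and an evaluation value of 2.0. The number of candidates grows exponentially with the number of free MTs, which is why a more efficient search may be preferable.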

The computer 10 a generates a post-parallelization first task 33 a, a post-parallelization second task 34 a, and a post-parallelization third task 35 a from the single program 30 by performing the process in the above-described manner. Then, the computer 10 a soft-combines the post-parallelization first task 33 a through the post-parallelization third task 35 a to generate the parallel program 21 a 1.

The computer 10 a achieves the same effects as the computer 10. Further, the automatic parallelization compiler in the modification also achieves the same effects as the computer 10 a.

Note that the computer 10 a can achieve the same object without having the core dependency determiner 3. Therefore, the computer 10 a does not need to perform Step S24.

In such case, the computer 10 a (i.e., the allocation restriction determiner 4) allocates, in Step S25, the multiple MTs which respectively show access to the common resource from the different tasks to the same core regardless of the core dependency.

Such changes, modifications, and summarized schemes are to be understood as being in the scope of the present disclosure as defined by appended claims.

What is claimed is:
 1. A parallelization method that generates a parallel program for a multi-core microcomputer having multiple cores, the parallel program parallelizing parallelizable unit processes based on (i) an analysis of a single program for a single-core microcomputer that includes multiple tasks, and (ii) a dependency relationship derived from the analysis and indicative of an access to a same resource from the unit processes, the parallelization method comprising: an extraction procedure extracting a plurality of the unit processes accessing the same resource based on an analysis of the same resource commonly accessed by the multiple tasks; and an allocation procedure allocating the plurality of the unit processes extracted by the extraction procedure and included in different tasks to access the same resource to a same core of the multi-core microcomputer to prevent from generating an interference among the multiple cores so that the plurality of the unit processes includes an intra-core exclusion process, wherein the interference is caused by the same resource commonly accessed by the multiple tasks; and the intra-core exclusion process is a process that performs a task interruption prohibition, and a task interruption permission.
 2. The parallelization method of claim 1, wherein the allocation procedure adds another intra-core exclusion process to the plurality of the unit processes that have been extracted by the extraction procedure, and have been allocated to a same core of the multi-core microcomputer.
 3. The parallelization method of claim 1, wherein the allocation procedure allocates the plurality of the unit processes extracted by the extraction procedure to different cores of the multi-core microcomputer, when the plurality of the unit processes extracted by the extraction procedure cannot be allocated to the same core of the multi-core microcomputer, and the parallelization method further comprises an addition procedure adding an inter-core exclusion process that prevents an access to the same resource from the different cores that respectively execute each of the plurality of the unit processes in case that the allocation procedure has allocated the plurality of the unit processes extracted by the extraction procedure to the different cores of the multi-core microcomputer.
 4. The parallelization method of claim 1, wherein the allocation procedure allocates the plurality of the unit processes extracted by the extraction procedure to different cores of the multi-core microcomputer, when the plurality of the unit processes extracted by the extraction procedure cannot be allocated to the same core of the multi-core microcomputer, and the parallelization method further comprises an interruption procedure adding an interruption process that prevents an access to the same resource from the different cores that respectively execute each of the plurality of the unit processes, by interrupting rest of the plurality of the unit processes during the access to the same resource by one of the plurality of the unit processes, when the allocation procedure allocates the plurality of the unit processes extracted by the extraction procedure to the different cores of the multi-core microcomputer.
 5. The parallelization method of claim 1, wherein core allocation information indicative of allocation destination of each of the plurality of the unit processes of the single program is obtained to allocate the plurality of the unit processes to the multiple cores of the multi-core microcomputer, and the allocation procedure prioritizes the core allocation information in allocating the plurality of the unit processes to specified cores, and allocates the plurality of the unit processes extracted by the extraction procedure to the same core of the multi-core microcomputer.
 6. A parallelization tool including a computer for generating a parallel program by parallelizing parallelizable unit processes for a multi-core microcomputer having multiple cores based on a dependency relationship indicative of an access to a same resource from plural unit processes of a single program, according to an analysis of a single program including multiple tasks for a single-core microcomputer, the parallelization tool comprising: an extractor extracting a plurality of the unit processes accessing the same resource based on an analysis of the same resource commonly accessed by the multiple tasks; and an allocator allocating the plurality of the unit processes extracted by the extractor and included in different tasks to access the same resource to a same core of the multi-core microcomputer to prevent from generating an interference among the multiple cores so that the plurality of the unit processes includes an intra-core exclusion process, wherein the interference is caused by the same resource commonly accessed by the multiple tasks; and the intra-core exclusion process is a process that performs a task interruption prohibition, and a task interruption permission.
 7. The parallelization tool of claim 6, wherein the allocator adds another intra-core exclusion process to the plurality of the unit processes that have been extracted by the extractor, and have been allocated to the same core of the multi-core microcomputer.
 8. The parallelization tool of claim 6, wherein the allocator allocates the plurality of the unit processes extracted by the extractor to different cores of the multi-core microcomputer, when the plurality of the unit processes extracted by the extractor cannot be allocated to the same core of the multi-core microcomputer, and the parallelization tool further comprises an adder adding an inter-core exclusion process that prevents an access to the same resource from the different cores that respectively execute the plurality of the unit processes in case that the allocator has allocated the plurality of the unit processes extracted by the extractor to the different cores of the multi-core microcomputer.
 9. The parallelization tool of claim 6, wherein the allocator allocates the plurality of the unit processes extracted by the extractor to different cores of the multi-core microcomputer, when the plurality of the unit processes extracted by the extractor cannot be allocated to the same core of the multi-core microcomputer, and the parallelization tool further comprises an interrupter adding an interruption process that prevents an access to the same resource from the different cores that respectively execute each of the plurality of the unit processes by interrupting rest of the plurality of the unit processes during the access to the same resource by one of the plurality of the unit processes, when the allocator allocates the plurality of the unit processes extracted by the extractor to the different cores of the multi-core microcomputer.
 10. The parallelization tool of claim 6, wherein core allocation information indicative of allocation destination of each of the plurality of the unit processes of the single program is obtained by the computer to allocate the plurality of the unit processes to the multiple cores of the multi-core microcomputer, and the allocator prioritizes the core allocation information in allocating the plurality of the unit processes to specified cores, and allocates the plurality of the unit processes extracted by the extractor to the same core of the multi-core microcomputer.
 11. An in-vehicle device comprising: a multi-core microcomputer having multiple cores; and a parallel program parallelizing a plurality of unit process in a single program for a single-core microcomputer, the single program including/processing multiple tasks by a single core, wherein the parallel program is configured to (a) parallelize the multiple tasks by parallelizing a plurality of parallelizable unit processes based on a data dependency indicative of an access to a same resource from plural unit processes of the single program, according to an analysis of the single program including the multiple tasks for the single-core microcomputer, (b) extract the plural unit processes extracted based on indication of an access to the commonly-accessed resource, by analyzing a commonly-accessed resource commonly accessed from the multiple tasks, and (c) allocate the extracted plural unit processes included in different tasks to access the same resource to the same core of the multi-core microcomputer to prevent from generating an interference among the multiple cores so that the plurality of the unit processes includes an intra-core exclusion process, the interference is caused by the same resource commonly accessed by the multiple tasks; and the intra-core exclusion process is a process that performs a task interruption prohibition, and a task interruption permission, and the multi-core microcomputer is configured to execute the parallel program by using the multiple cores to respectively execute the unit processes assigned to the multiple cores. 